v0.1.3 release. GLM-5 lands! #19

Merged
lcy-seso merged 1 commit into tile-ai:main from lcy-seso:v0.1.3 on Feb 14, 2026

@lcy-seso (Collaborator)

v0.1.3 release. GLM-5 lands.

Copilot AI review requested due to automatic review settings February 14, 2026 11:23
Co-authored-by: Guojun Chen <gjchen@live.com>
Co-authored-by: Yuxiao Guo <yuxiao.guo@outlook.com>
Co-authored-by: Yuqing Xia <Xiayuqing0622@outlook.com>
Co-authored-by: Jilong Xue <xuejilong@gmail.com>
Co-authored-by: Lingxiao Ma <xysmlx@gmail.com>
Co-authored-by: Liu Heng <18821707235@163.com>
Co-authored-by: Zheng QiHang <zhengqihang0915@qq.com>

Copilot AI left a comment

Pull request overview

This PR cuts a v0.1.3 release that adds GLM-5 support alongside DeepSeek-V3.2, introduces a small benchmarking suite, and refactors/extends the Python-side TileRT model/ops wrappers to support the new kernels and weight formats.

Changes:

  • Add GLM-5 model args + integrate GLM-5 dispatch paths across many DeepSeek v3.2 ops/modules (shared implementation via shape/dim-based dispatch).
  • Introduce a benchmark harness and update the generation CLI to support model selection and sampling options.
  • Refactor core model utilities/base classes and add new shared model primitives (python/models/common.py).
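
The "shape/dim-based dispatch" mentioned in the first bullet can be pictured roughly as follows. This is only a minimal sketch, not the code in this PR: the function names, the kernel stubs, and the two hidden sizes are hypothetical placeholders chosen for illustration.

```python
from typing import Callable, Dict, Sequence

# Placeholder sizes for illustration only; they are NOT read from the PR.
DS_HIDDEN_DIM = 7168
GLM5_HIDDEN_DIM = 6144

Kernel = Callable[[Sequence[float]], str]


def _ds_kernel(x: Sequence[float]) -> str:
    # Hypothetical stand-in for a DeepSeek-V3.2 kernel call.
    return "deepseek_v3_2"


def _glm5_kernel(x: Sequence[float]) -> str:
    # Hypothetical stand-in for a GLM-5 kernel call.
    return "glm_5"


# One shared wrapper; the input's dimension selects the kernel.
_KERNELS_BY_DIM: Dict[int, Kernel] = {
    DS_HIDDEN_DIM: _ds_kernel,
    GLM5_HIDDEN_DIM: _glm5_kernel,
}


def dispatch_by_dim(x: Sequence[float]) -> str:
    """Pick a kernel from the input's size and fail loudly on unknown sizes."""
    dim = len(x)
    kernel = _KERNELS_BY_DIM.get(dim)
    if kernel is None:
        supported = sorted(_KERNELS_BY_DIM)
        raise ValueError(f"unsupported dim {dim}; expected one of {supported}")
    return kernel(x)
```

Raising on an unrecognized dimension, rather than falling back to a default kernel, keeps a mismatched weight format from silently running through the wrong path.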

Reviewed changes

Copilot reviewed 63 out of 69 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| python/tilert_init.py | Simplifies init/force-init wrappers to call ops without placeholder tensors. |
| python/profiler/init.py | Adds profiler package docstring. |
| python/models/utils.py | Fixes conditional to guard rope correction on `factor is not None`. |
| python/models/preprocess/init.py | Removes preprocess package exports (WeightLoader no longer re-exported). |
| python/models/glm_5/params.py | Adds GLM-5 params module stub/docstring. |
| python/models/glm_5/model_args.py | Introduces ModelArgsGLM5 with GLM-5 hyperparameters. |
| python/models/glm_5/init.py | Package init (empty in diff). |
| python/models/deepseek_v3_2/temp_var_indices.py | Adds named temp-var indices + validation helper. |
| python/models/deepseek_v3_2/refs/kernel.py | Adds reference kernels (tilelang/triton) for fp8 ops/quant/dequant. |
| python/models/deepseek_v3_2/refs/init.py | Exposes reference kernel helpers via package exports. |
| python/models/deepseek_v3_2/ops/up_gate_silu.py | Adds TileRT op wrapper. |
| python/models/deepseek_v3_2/ops/unproj_o_allreduce.py | Adds unified DS/GLM5 unproj+allreduce module + weight converters. |
| python/models/deepseek_v3_2/ops/topk.py | Adds top-k wrappers + TopK nn.Module wrapper. |
| python/models/deepseek_v3_2/ops/top_p.py | Adds unified top-p dispatch for DS/GLM5. |
| python/models/deepseek_v3_2/ops/top1_allreduce.py | Adds top1 allreduce wrapper. |
| python/models/deepseek_v3_2/ops/sparse_index.py | Adds sparse index + sparse index topk wrappers (GLM5 paths). |
| python/models/deepseek_v3_2/ops/rotate.py | Adds unified rotate op + Rotate module. |
| python/models/deepseek_v3_2/ops/rmsnorm_up_gate_silu.py | Adds fused RMSNorm+UpGateSiLU module + algorithm selection. |
| python/models/deepseek_v3_2/ops/rmsnorm_quant.py | Adds unified RMSNorm(+optional quant) wrapper for DS/GLM5. |
| python/models/deepseek_v3_2/ops/rmsnorm_proj_top1.py | Adds RMSNorm+proj+top1 wrapper. |
| python/models/deepseek_v3_2/ops/rmsnorm_kv.py | Adds KV RMSNorm module. |
| python/models/deepseek_v3_2/ops/rmsnorm_head_proj.py | Adds head projection module + GLM5/DS dispatch. |
| python/models/deepseek_v3_2/ops/rmsnorm_expert_proj.py | Adds expert-projection module wrapper. |
| python/models/deepseek_v3_2/ops/qkv_rope.py | Adds unified QKV RoPE wrapper + module. |
| python/models/deepseek_v3_2/ops/projx_wis.py | Adds projection wrapper/module for indexer score weights. |
| python/models/deepseek_v3_2/ops/projq_wqb.py | Adds Q projection wrapper/module for KV-LoRA (GLM5 support). |
| python/models/deepseek_v3_2/ops/projo_wkvb.py | Adds O projection wrapper/module for KV-LoRA (GLM5 support). |
| python/models/deepseek_v3_2/ops/layernorm_rope_rotate.py | Adds LayerNorm+RoPE+rotate wrapper/module. |
| python/models/deepseek_v3_2/ops/head_proj.py | Adds head projection wrapper. |
| python/models/deepseek_v3_2/ops/flash_sparse_mla.py | Adds flash sparse MLA wrapper + combine module. |
| python/models/deepseek_v3_2/ops/expert_select.py | Adds expert select wrappers (two-stage DS vs one-stage GLM5). |
| python/models/deepseek_v3_2/ops/eh_proj_allreduce.py | Adds EH proj + allreduce module with DS/GLM5 dispatch. |
| python/models/deepseek_v3_2/ops/down_allreduce.py | Adds down+allreduce wrappers + module with DS/GLM5 dispatch. |
| python/models/deepseek_v3_2/ops/init.py | Exposes deepseek_v3_2 ops package API. |
| python/models/deepseek_v3_2/modules/mtp_preprocess.py | Adds MTP preprocess module + weight converter. |
| python/models/deepseek_v3_2/modules/mtp.py | Adds MTP module wiring (preprocess + moe + head). |
| python/models/deepseek_v3_2/modules/moe.py | Adds MoE module wiring + GLM5 algorithm selection. |
| python/models/deepseek_v3_2/modules/mlp.py | Adds MLP module wiring + GLM5 algorithm selection. |
| python/models/deepseek_v3_2/modules/mla.py | Adds MLA module wiring + GLM5 algorithm selection + cache vars. |
| python/models/deepseek_v3_2/modules/dsa.py | Refactors DSA module temp-var allocation using named indices. |
| python/models/deepseek_v3_2/modules/init.py | Adds modules package exports list. |
| python/models/deepseek_v3_2/model_args.py | Updates DS v3.2 defaults (dtype/seq-len/etc) and adds arch_name, kv_cache_pad, quant params. |
| python/models/deepseek_v3_2/dsa_mtp_e2e_show_hands.py | Removes legacy “show hands” E2E module. |
| python/models/common.py | Adds shared linear/RMSNorm/parallel layers + fp8 reference-kernel usage. |
| python/models/base.py | Refactors base module + adds SerializableTileRTModule + converter base class. |
| python/generate.py | Adds model selection + sampling args and integrates benchmark modes. |
| python/benchmark/short_prompt.py | Adds short-prompt benchmark. |
| python/benchmark/long_prompt.py | Adds long-prompt benchmark. |
| python/benchmark/coding_prompt.py | Adds coding-prompt benchmark. |
| python/benchmark/init.py | Adds benchmark utilities/types + markdown table printer. |
| python/init.py | Removes ShowHandsGenerator export from top-level package. |
| assets/perf.png | Asset changes for docs/benchmarks (binary). |
| assets/generate.gif | Asset changes for docs/benchmarks (binary). |
| assets/glm5-mtp.png | Adds GLM5 MTP benchmark figure for README. |
| assets/glm5-without-mtp.png | Adds GLM5 non-MTP benchmark figure for README. |
| assets/logo.png | Adds/updates logo asset for README/site. |
| README.md | Updates release notes + GLM5 benchmarking figures + new weight conversion workflow. |
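
For readers unfamiliar with the "named temp-var indices + validation helper" pattern listed for temp_var_indices.py, one way it could look is sketched below. This is an assumption about the general idea, not the file's actual contents: the `TempVar` enum, its slot names, and `validate_temp_var_indices` are all hypothetical.

```python
from enum import IntEnum
from typing import Type


class TempVar(IntEnum):
    # Hypothetical slot names; the real module defines its own.
    Q = 0
    KV = 1
    SCORES = 2
    OUT = 3


def validate_temp_var_indices(enum_cls: Type[IntEnum], num_buffers: int) -> None:
    """Check that the indices are 0..N-1 with no gaps and match the buffer count."""
    indices = sorted(int(member) for member in enum_cls)
    if indices != list(range(len(indices))):
        raise ValueError(f"indices must be contiguous from 0, got {indices}")
    if len(indices) != num_buffers:
        raise ValueError(
            f"{len(indices)} named indices but {num_buffers} buffers allocated"
        )


# Allocate one slot per named index, then address slots by name, not by number.
temp_vars = [None] * len(TempVar)
validate_temp_var_indices(TempVar, len(temp_vars))
temp_vars[TempVar.SCORES] = "scores-buffer"
```

Named indices replace magic numbers such as `temp_vars[2]`, and the validation step catches a buffer list that has drifted out of sync with the enum.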

@xysmlx (Collaborator) left a comment

Great!

@lcy-seso merged commit deeedc3 into tile-ai:main on Feb 14, 2026
8 checks passed
@lcy-seso deleted the v0.1.3 branch on February 14, 2026 11:35